
ColOpenData can be used to access open geospatial data from Colombia. This data is retrieved from the Geostatistical National Framework (MGN), published by the National Administrative Department of Statistics (DANE). The MGN contains the political-administrative division and is used to reference census statistical information. Further information is available directly from DANE.

This package contains the 2018 version of the MGN, which also includes a summarized version of the National Population and Dwelling Census (CNPV) at different aggregation levels. Each level is stored in a separate dataset, which can be retrieved with the download_geospatial function. It takes three arguments, two of which have defaults:

  • dataset character with the geospatial dataset name.
  • include_geom logical for including (or not) geometry. Default is TRUE.
  • include_cnpv logical for including (or not) CNPV demographic and socioeconomic information. Default is TRUE.

To better understand dataset names and details please refer to Documentation and Dictionaries.

Details for geospatial datasets relate to the level of aggregation as follows:

Code                    Level
DANE_MGN_2018_DPTO      Department
DANE_MGN_2018_MPIO      Municipality
DANE_MGN_2018_MPIOCL    Municipality including Class
DANE_MGN_2018_MZN       Block
DANE_MGN_2018_SECR      Rural Sector
DANE_MGN_2018_SECU      Urban Sector
DANE_MGN_2018_SETR      Rural Section
DANE_MGN_2018_SETU      Urban Section
DANE_MGN_2018_ZU        Urban Zone
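
Since every dataset name shares the DANE_MGN_2018_ prefix, the table above can be captured as a small lookup. This is only a sketch: the mgn_levels vector and the mgn_dataset() helper are hypothetical names introduced here for illustration and are not part of ColOpenData.

```r
# Hypothetical lookup from aggregation level to dataset suffix. The names on
# the left are illustrative; only the resulting codes come from the table.
mgn_levels <- c(
  department = "DPTO",
  municipality = "MPIO",
  municipality_class = "MPIOCL",
  block = "MZN",
  rural_sector = "SECR",
  urban_sector = "SECU",
  rural_section = "SETR",
  urban_section = "SETU",
  urban_zone = "ZU"
)

# Build the full dataset name for a given level.
mgn_dataset <- function(level) {
  stopifnot(level %in% names(mgn_levels))
  paste0("DANE_MGN_2018_", mgn_levels[[level]])
}

mgn_dataset("municipality")

# The result could then be passed to download_geospatial(), for example to
# fetch only the attribute table without geometries (needs internet access):
# dpto <- download_geospatial(
#   dataset = mgn_dataset("department"),
#   include_geom = FALSE,
#   include_cnpv = TRUE
# )
```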

In this vignette you will learn:

  1. How to download geospatial data using ColOpenData
  2. How to use census data included in geospatial datasets
  3. How to visualize spatial data using leaflet and ggplot2

We will use geospatial data at the Municipality level (MPIO) for the department of Tolima and calculate the percentage of houses with internet connection in each municipality. We will then plot the result with both approaches mentioned above: static plots with ggplot2 and dynamic plots with leaflet.

We will start by importing the needed libraries.
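
The chunk that loads the libraries is not shown here. Judging only from the functions called later in this vignette, the following set would be enough. Treat the list as an assumption rather than the authors' exact setup; sf in particular is included only because the downloaded objects are simple feature collections.

```r
library(ColOpenData) # download_geospatial(), dictionary(), divipola_department_code()
library(dplyr)       # %>%, filter(), mutate()
library(sf)          # assumed: methods for the simple feature objects returned
library(ggplot2)     # ggplot(), geom_sf()
library(leaflet)     # leaflet(), addPolygons(), colorNumeric()
```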

Disclaimer: all data is loaded into the environment of the user’s R session, but is not saved to the user’s computer. Spatial datasets can be very large and might take a while to load.

Downloading geospatial data

First, we download the data using the function download_geospatial, including the geometries and the census-related information.

mpio <- download_geospatial(
  dataset = "DANE_MGN_2018_MPIO",
  include_geom = TRUE,
  include_cnpv = TRUE
)
#> Original data is retrieved from the National Administrative Department
#> of Statistics (Departamento Administrativo Nacional de Estadística -
#> DANE).
#> Reformatted by package authors.
#> Stored by Universidad de Los Andes under the Epiverse TRACE iniative.
head(mpio)
#> Simple feature collection with 6 features and 90 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -76.1027 ymin: 0.9764735 xmax: -74.89527 ymax: 2.326755
#> Geodetic CRS:  WGS 84
#>   DPTO_CCDGO MPIO_CCDGO             MPIO_CNMBR MPIO_CDPMP VERSION       AREA
#> 1         18        001              FLORENCIA      18001    2018 2547637532
#> 2         18        029                ALBANIA      18029    2018  414122070
#> 3         18        094 BELÉN DE LOS ANDAQUÍES      18094    2018 1191618572
#> 4         18        247            EL DONCELLO      18247    2018 1106076151
#> 5         18        256              EL PAUJÍL      18256    2018 1234734145
#> 6         18        410           LA MONTAÑITA      18410    2018 1701061430
#>    LATITUD  LONGITUD STCTNENCUE STP3_1_SI STP3_2_NO STP3A_RI STP3B_TCN
#> 1 1.749139 -75.55824      71877        32     71845       32         0
#> 2 1.227865 -75.88233       2825        24      2801       24         0
#> 3 1.500923 -75.87565       4243        54      4189       54         0
#> 4 1.791386 -75.19394       8809         0      8809        0         0
#> 5 1.617746 -75.23404       5795         0      5795        0         0
#> 6 1.302860 -75.23573       5113        15      5098       15         0
#>   STP4_1_SI STP4_2_NO STP9_1_USO STP9_2_USO STP9_3_USO STP9_4_USO STP9_2_1_M
#> 1         0     71877      61176       2178       8436         87         39
#> 2         0      2825       1826         49        948          2          3
#> 3         1      4242       3223        109        900         11          4
#> 4         0      8809       6598        357       1850          4         11
#> 5         0      5795       4891        204        695          5          4
#> 6         0      5113       4077        241        786          9          2
#>   STP9_2_2_M STP9_2_3_M STP9_2_4_M STP9_2_9_M STP9_3_1_N STP9_3_2_N STP9_3_3_N
#> 1       1550        566         18          5         54       2591       1061
#> 2         34         12          0          0          3         21         32
#> 3         99          6          0          0          8         88          6
#> 4        259         87          0          0         21        334        124
#> 5        161         38          1          0          5        239        104
#> 6        205         32          2          0          3        103        123
#>   STP9_3_4_N STP9_3_5_N STP9_3_6_N STP9_3_7_N STP9_3_8_N STP9_3_9_N STP9_3_10
#> 1        535        368       3172        233          7         19       371
#> 2        728         53         92          5          0          0        14
#> 3          2         61        626          8          0          8        93
#> 4        807         89        361         42          0         27        39
#> 5          4         70        211         16          0          0        45
#> 6         17         84        362         30          0          0        58
#>   STP9_3_99 STVIVIENDA STP14_1_TI STP14_2_TI STP14_3_TI STP14_4_TI STP14_5_TI
#> 1        25      63354      47817      13764       1624         21          8
#> 2         0       1875       1793         40         22         17          0
#> 3         0       3332       3189        113         24          2          2
#> 4         6       6955       6006        775        160          1          1
#> 5         1       5095       4700        145        188          4          1
#> 6         6       4318       3890        224        161          6          2
#>   STP14_6_TI STP15_1_OC STP15_2_OC STP15_3_OC STP15_4_OC TSP16_HOG STP19_EC_1
#> 1        120      49809       2681       2150       8714     51430      48638
#> 2          3       1409         13         55        398      1559       1300
#> 3          2       2883          2        107        340      3161       2595
#> 4         12       5767        304        388        496      6129       5375
#> 5         57       4568          5        323        199      5848       4195
#> 6         35       3553        151        308        306      3748       2159
#>   STP19_ES_2 STP19_EE_1 STP19_EE_2 STP19_EE_3 STP19_EE_4 STP19_EE_5 STP19_EE_6
#> 1       1171      34851      10343       2169        509         13          3
#> 2        109       1184        106          1          0          0          0
#> 3        288       2118        366         17          2          1          0
#> 4        392       3548        962        793          1          1          1
#> 5        373       3330        770         84          0          1          1
#> 6       1394       1964        144          9          1          0          0
#>   STP19_EE_9 STP19_ACU1 STP19_ACU2 STP19_ALC1 STP19_ALC2 STP19_GAS1 STP19_GAS2
#> 1        750      45179       4630      41138       8671      37028      12074
#> 2          9        808        601        703        706         26       1371
#> 3         91       2017        866       1806       1077         52       2796
#> 4         69       4175       1592       4323       1444         57       5549
#> 5          9       2505       2063       2359       2209       1463       3041
#> 6         41       1441       2112       1329       2224         67       3454
#>   STP19_GAS9 STP19_REC1 STP19_REC2 STP19_INT1 STP19_INT2 STP19_INT9 STP27_PERS
#> 1        707      45491       4318      13362      35727        720     156789
#> 2         12        727        682         27       1370         12       4514
#> 3         35       1905        978         73       2775         35       9075
#> 4        161       4348       1419        211       5395        161      17775
#> 5         64       2414       2154        125       4379         64      13014
#> 6         32       1273       2280         64       3457         32      12128
#>   STPERSON_L STPERSON_S STP32_1_SE STP32_2_SE STP34_1_ED STP34_2_ED STP34_3_ED
#> 1       4315     152474      77620      79169      25503      30249      29951
#> 2        151       4363       2323       2191        725       1016        717
#> 3        346       8729       4551       4524       1592       2254       1388
#> 4        203      17572       8790       8985       3047       3811       2601
#> 5        192      12822       6601       6413       2346       2882       2170
#> 6        604      11524       6437       5691       2229       3022       1836
#>   STP34_4_ED STP34_5_ED STP34_6_ED STP34_7_ED STP34_8_ED STP34_9_ED STP51_PRIM
#> 1      23602      17235      14349       8969       4687       2244      37918
#> 2        568        536        445        253        162         92       1696
#> 3       1121        986        816        487        286        145       2596
#> 4       2302       2032       1792       1135        707        348       6091
#> 5       1587       1460       1188        703        430        248       4805
#> 6       1563       1441       1010        578        323        126       5011
#>   STP51_SECU STP51_SUPE STP51_POST STP51_13_E STP51_99_E Shape_Leng Shape_Area
#> 1      14123      14606        856       5892       3799   2.942508 0.20692777
#> 2        150         98          0        215         46   1.112829 0.03361758
#> 3        418        171         12        720        123   2.234657 0.09674460
#> 4        712        347         26       1095        171   3.154370 0.08986744
#> 5        261        226          0        916         99   3.529316 0.10030928
#> 6        384        134          0        724        182   3.402939 0.13817351
#>                             geom
#> 1 MULTIPOLYGON (((-75.42074 2...
#> 2 MULTIPOLYGON (((-75.89506 1...
#> 3 MULTIPOLYGON (((-75.78705 1...
#> 4 MULTIPOLYGON (((-75.36167 2...
#> 5 MULTIPOLYGON (((-75.36638 2...
#> 6 MULTIPOLYGON (((-75.40346 1...

After downloading, we have to filter by department, using the DIVIPOLA code for Tolima. For further details on DIVIPOLA codification and functions, please refer to Documentation and Dictionaries.

divipola_department_code("TOLIMA")
#> [1] "73"
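
The department code is also the prefix of every municipality code in that department. In the rows of mpio printed earlier, the five-character MPIO_CDPMP value is the two-character DPTO_CCDGO followed by the three-character MPIO_CCDGO. A small base R sketch of that relationship, using values copied from that output:

```r
# Values copied from the first row of head(mpio) (Florencia).
dpto_code <- "18"
mpio_code <- "001"

# Concatenating both parts gives the full municipality code.
full_code <- paste0(dpto_code, mpio_code)
full_code

# Going the other way, the department is the first two characters.
substr(full_code, 1, 2)
```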

To find out which column contains the department codes, so that we can filter for Tolima, we need the corresponding dataset dictionary. It can be downloaded with the dictionary function, which takes the dataset name and returns the associated information. For further details, please refer to the documentation on dictionaries mentioned previously.

dict <- dictionary("DANE_MGN_2018_MPIO")

head(dict)
#>     variable         tipo longitud
#> 1 DPTO_CCDGO         Text        2
#> 2 MPIO_CCDGO         Text        3
#> 3 MPIO_CNMBR         Text      250
#> 4 MPIO_CDPMP         Text        5
#> 5    VERSION Long Integer       NA
#> 6       AREA       Double       NA
#>                                                                                     descripcion
#> 1                                                                       Código del departamento
#> 2                                                            Código que identifica al municipio
#> 3                                                                          Nombre del municipio
#> 4                                                Código concatenado que identifica al municipio
#> 5                                                              Año de la información geográfica
#> 6 Área del municipio en metros cuadrados  (Sistema de coordenadas planas MAGNA_Colombia_Bogota)
#>   categoria_original
#> 1               <NA>
#> 2               <NA>
#> 3               <NA>
#> 4               <NA>
#> 5               <NA>
#> 6               <NA>
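
The dictionary is a regular data frame, so it can be searched instead of read row by row. The sketch below rebuilds a few rows of the printed output by hand (so it runs without a download) and looks for the description that mentions the department. The dict_head object is a hypothetical stand-in; with the real dict object the same grepl() call should apply.

```r
# Hand-made copy of the first dictionary rows printed above.
dict_head <- data.frame(
  variable = c("DPTO_CCDGO", "MPIO_CCDGO", "MPIO_CNMBR", "MPIO_CDPMP"),
  descripcion = c(
    "Código del departamento",
    "Código que identifica al municipio",
    "Nombre del municipio",
    "Código concatenado que identifica al municipio"
  ),
  stringsAsFactors = FALSE
)

# Keep the rows whose description mentions "departamento".
is_match <- grepl("departamento", dict_head$descripcion, ignore.case = TRUE)
hits <- dict_head[is_match, ]
hits$variable
```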

After exploring the dictionary, we can see that the column containing the department codes is DPTO_CCDGO. We will filter based on that column.

tolima <- mpio %>% filter(DPTO_CCDGO == "73")
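
Note that the dictionary lists DPTO_CCDGO as text, which is why the filter compares against the string "73" rather than a number. The difference matters for codes with a leading zero. A minimal base R sketch with a toy table (the codes object is made up for illustration):

```r
# Toy table; "05" is the DIVIPOLA code for Antioquia, "73" for Tolima.
codes <- data.frame(
  DPTO_CCDGO = c("05", "73", "73"),
  MPIO_CCDGO = c("001", "001", "024"),
  stringsAsFactors = FALSE
)

# Comparing against a string keeps leading zeros intact.
codes[codes$DPTO_CCDGO == "05", ]

# Comparing against a number coerces it to "5", which matches nothing.
nrow(codes[codes$DPTO_CCDGO == 5, ])
```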

To calculate the percentage of houses with internet connection, we need the number of houses with internet connection and the total number of houses in each municipality. From the dictionary we get that the number of houses with internet connection is STP19_INT1 and the total number of houses is STVIVIENDA. We will calculate the percentage as follows:

tolima <- tolima %>% mutate(INT_PERC = round(100 * STP19_INT1 / STVIVIENDA, 2))
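
As a quick check of the arithmetic, the same ratio can be worked out by hand for two of the municipalities shown in the head(mpio) output. The example data frame below is a hypothetical stand-in holding counts copied from that output; it shows the share both as a proportion and on a 0 to 100 scale.

```r
# Counts copied from the head(mpio) output above.
example <- data.frame(
  MPIO_CNMBR = c("FLORENCIA", "ALBANIA"),
  STP19_INT1 = c(13362, 27),
  STVIVIENDA = c(63354, 1875)
)

# Share of houses with internet, as a proportion and as a percentage.
example$share <- example$STP19_INT1 / example$STVIVIENDA
example$share_pct <- round(100 * example$share, 2)
example
```

For these two rows the share is roughly 21% in Florencia and 1.44% in Albania.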

Static plots (ggplot2)

ggplot2 can be used to generate static plots of spatial data by using the geometry geom_sf as follows:

ggplot(data = tolima) +
  geom_sf(mapping = aes(fill = INT_PERC), color = NA)

By default, the plot uses a blue palette, which makes it hard to see small differences in internet coverage across municipalities. Color palettes and themes can be customized for each plot through scales and theme elements, which are described in the ggplot2 documentation. We will use a two-color gradient to make the differences more visible.

ggplot(data = tolima) +
  geom_sf(mapping = aes(fill = INT_PERC), color = NA) +
  theme_minimal() +
  theme(
    panel.grid = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank()
  ) +
  scale_fill_gradient("Percentage", low = "#10bed2", high = "#deff00") +
  ggtitle(
    label = "Internet coverage",
    subtitle = "Tolima, Colombia"
  )
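
If a scale with an explicit midpoint is preferred, ggplot2 also provides scale_fill_gradient2(), which adds a middle color. The sketch below uses two toy squares instead of the Tolima data so that it runs without a download. The square() helper, the toy object, the white middle color and the midpoint of 50 are all illustrative assumptions.

```r
library(sf)
library(ggplot2)

# Two toy unit squares standing in for municipalities.
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}
toy <- st_sf(
  INT_PERC = c(10, 80),
  geometry = st_sfc(square(0), square(1), crs = 4326)
)

# Diverging fill: values below the midpoint shade toward `low`,
# values above it toward `high`.
p <- ggplot(data = toy) +
  geom_sf(mapping = aes(fill = INT_PERC), color = NA) +
  scale_fill_gradient2(
    "Percentage",
    low = "#10bed2", mid = "white", high = "#deff00",
    midpoint = 50
  ) +
  theme_minimal()
p
```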

Dynamic plots (leaflet)

For dynamic plots, we can use leaflet, an open-source library for interactive maps. To create the same plot, we will first create the color palette.

colfunc <- colorRampPalette(c("#10bed2", "#deff00"))
pal <- colorNumeric(
  palette = colfunc(100),
  domain = tolima$INT_PERC
)
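
To see what the ramp contains before it is handed to colorNumeric, the interpolation can be inspected on its own. This sketch uses only colorRampPalette, which ships with base R (grDevices); the choice of five steps is arbitrary.

```r
# colorRampPalette() returns a function that interpolates n colors
# between the given endpoints.
colfunc <- colorRampPalette(c("#10bed2", "#deff00"))

ramp <- colfunc(5)
ramp

# The first and last colors are the two endpoints.
toupper(ramp[c(1, 5)])
```

colorNumeric then maps each value in the domain onto a position along this ramp, so the lowest INT_PERC gets the first color and the highest gets the last.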

With this color palette we can generate the interactive plot. leaflet also provides open-source base maps, such as OpenStreetMap and CartoDB. For further details on leaflet, please refer to the package’s documentation.

leaflet(tolima) %>%
  # Light base map from CartoDB
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(
    stroke = FALSE, # no polygon borders
    fillColor = ~ pal(INT_PERC),
    fillOpacity = 1,
    popup = ~ as.character(INT_PERC) # value shown on click
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = ~INT_PERC,
    opacity = 1,
    title = "Internet Coverage"
  )